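Course Information
Course Title
The Applications of Statistics in Marine Chemistry (統計學在海洋化學上的應用)
Semester
106-1
Intended Students
College of Science, Marine Chemistry Division
Instructor
林卉婷
Course Number
Ocean5106
Course Identifier
241EU6030
Section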
 
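Credits
2.0
Full/Half Year
Half year
Required/Elective
Elective
Class Time
Friday, periods 1 and 2 (8:10-10:00)
Classroom
Institute of Oceanography, Room 115 (海研115)
Remarks
This course is taught in English. If only Taiwanese students enroll, a mix of Chinese and English will be used in class.
Enrollment limit: 10 students
Ceiba Course Website
http://ceiba.ntu.edu.tw/1061Ocean5106_
Course Introduction Video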
 
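Core Competencies
Diagram relating core competencies to course planning
Course Syllabus
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
Course Overview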

This course is designed based on the teaching method of “authentic learning” to guide students to learn about the applications of statistics in marine chemistry. Basic statistics will be introduced with real-world marine chemical data sets. This course is complementary to the mandatory course “NTU Fundamentals of Oceanic Statistics” (基礎海洋統計) and to the Marine Chemistry Laboratory (海洋化學實習) for students in the chemical oceanography division.

The course uses a variety of data sets from analytical methods commonly used by chemical oceanographers, including spectrophotometry, chromatography, mass spectrometry, chemiluminescence, fluorimetry, optical sensors, and pH sensors. Because the principles of analysis differ significantly among analytical methods, it is important that students be versatile in dealing with various data sets. For example, spectrophotometry is a basic method for the analysis of marine micronutrients; its sensitivity and stability can be estimated directly from the extinction coefficient (the intensity of the color), and its limit of detection is a fixed value. Other instruments, such as a mass spectrometer, can be tuned in various ways to provide better sensitivity, so different labs report different limits of detection. An analysis using a spectrophotometer also costs only 1/1000 of an analysis using a mass spectrometer. By combining statistical tools with knowledge of analytical chemistry, students will learn to choose the most suitable method for their research.

The course will start by having students look for problems in a reported data set, for example, a figure of data points reported without error bars. Students will explore possible ways to estimate the errors. We will then discuss how to design experiments to measure the uncertainties. The course will then provide data from spectrophotometry, chromatography, mass spectrometry, chemiluminescence, fluorimetry, optical sensors, and pH sensors for students to report in a statistically acceptable way. For example, students will have to come up with a way to calibrate the instrumental data, converting intensity into meaningful concentrations. Students will also be given real-world problems to solve. For example, what can students do when two analytical methods, such as an oxygen sensor and a colorimetric method, do not yield the same concentration?

In the final third of the course, we will explore possible ways to deal with a massive data set. For example, with the advances in mass spectrometry, each sample can easily be analyzed for the concentrations (or intensities) of more than 20 compounds. Are there statistical methods that help us look for patterns in the variations among samples? The applications of principal component analysis and factor analysis will be introduced or reviewed. Marine chemical data from the literature or from observatory reports will be used for student practice. At the end of the course, students will be asked to gather a data set and to use exploratory factor analysis to explain the correlation/covariance of the data.
 

Course Objectives
(1) Students will know about basic statistics for chemical oceanographic data.
- Numbers of replicates
- Limit of detection
- Sensitivity
- Error analysis
- Outliers
- Significance tests
- Distribution patterns
- Principal component analysis
- Factor analysis
(2) Students will be able to identify problems in reported data sets.
(3) Students will have the capability to use statistical tools to explore their research data.
 
Course Requirements
This course will be offered in English, so students must understand English well enough to enroll. Students are required to discuss and present in English. Chinese may be used occasionally to explain challenging concepts. Students are required to attend all classes; no more than two unexcused absences are permitted. Students can use Microsoft Excel, R, or MATLAB to process the data. SPSS will be introduced, so some of the lectures will be given in the computer lab.
Expected weekly study hours after class
 
Office Hours
 
Required Reading
 
References
1. Methods of Seawater Analysis (3rd edition), http://onlinelibrary.wiley.com/book/10.1002/9783527613984
2. Taylor, John R., An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements
3. NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/
 
Grading
(For reference only)
   
Course Schedule
Week
Date
Topic
Week 1
9/14  Ice-breaking

Why does each student come to this course?
Why did they choose to do research in chemical oceanography?
Why is chemical oceanography important?
Why is statistics powerful for chemical oceanography data sets?
What are their past experiences in using statistics?
What statistical tools have the students used before?
Week 4
10/6  "8:20-10:20"

Data distribution
Standard deviation
Standard deviation and probability
Confidence intervals
Distribution of sample means
t-distribution
normal distribution
Error propagation 
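
As a minimal sketch of how the topics above connect, the following Python snippet computes the mean, the sample standard deviation, the standard error of the mean, and a 95% confidence interval from a small set of replicates using the t-distribution. The function name `confidence_interval_95` and the replicate values are hypothetical placeholders, not course material; the critical t values are the standard two-tailed 95% table entries.

```python
import math
import statistics

# Two-tailed 95% critical t values, keyed by degrees of freedom (n - 1).
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def confidence_interval_95(values):
    """Return (mean, sample SD, half-width of the 95% confidence interval)."""
    n = len(values)
    df = n - 1
    if df not in T_CRIT_95:
        raise ValueError("this sketch only tabulates t for 2 to 10 replicates")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample standard deviation (n - 1)
    sem = sd / math.sqrt(n)            # standard error of the mean
    return mean, sd, T_CRIT_95[df] * sem


# Hypothetical replicate measurements (arbitrary concentration units).
replicates = [2.10, 2.14, 2.08, 2.12, 2.11]
mean, sd, half_width = confidence_interval_95(replicates)
print(f"mean = {mean:.3f}, s = {sd:.3f}, 95% CI = mean ± {half_width:.3f}")
```

With only a few replicates the t multiplier is much larger than the normal-distribution value of about 1.96, which is why small data sets give wide confidence intervals.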
Week 5
10/18  Time to generate your own data set. Students who do not have a current project that produces repeated measurements for practicing statistical tools may generate data by following the attached protocols.

Please handle the pipettors with care; each pipettor costs more than NT$15,000.
Week 6
10/24  Mid-term exam 
Week 9
11/9  "3:30-5:20pm"
11. Significance tests
t-test
- One-tailed
- Two-tailed
- Paired
[Q] by students
Students will be asked to come up with questions on which their classmates can perform significance tests. After students exchange their questions, they will be given 20 minutes to work on the tests, write down their results, and explain the oceanographic meaning of each test.
An example question by Tina
[Q] Mira makes a calibration curve for dissolved inorganic carbon using baking soda. The highest standard that she made is 2.50 mmol/L. Then, she measured a certified reference material (CRM) purchased from Dickson's lab (http://cdiac.ornl.gov/oceans/Dickson_CRM/batches.html). The batch number of this CRM is #116. Mira's analyzer yields a dissolved inorganic carbon concentration of 2.123 ± 0.002 mmol/L for the CRM. Is the performance of Mira's analyzer acceptable? Why or why not? What might be a reason for the difference between the reported value and the measured value? What would you recommend Mira try?
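
One possible way to frame such a question is a one-sample, two-tailed t-test of the measured mean against the reference value. The sketch below is only an illustration under stated assumptions: the ± 0.002 is treated as the sample standard deviation of n = 3 replicates, and the reference value 2.120 mmol/L is a placeholder, not the certified value of batch #116, which has to be looked up and converted to the same units. The function name `one_sample_t_test` is hypothetical.

```python
import math

# Two-tailed 95% critical t values, keyed by degrees of freedom (n - 1).
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def one_sample_t_test(mean, sd, n, reference):
    """Return (t, significant) for H0: the true mean equals `reference`."""
    df = n - 1
    if df not in T_CRIT_95:
        raise ValueError("this sketch only tabulates t for 2 to 10 replicates")
    t = (mean - reference) / (sd / math.sqrt(n))
    return t, abs(t) > T_CRIT_95[df]


# Mean and spread from the question; n and the reference value are
# ASSUMED placeholders chosen only to make the example runnable.
t, significant = one_sample_t_test(mean=2.123, sd=0.002, n=3, reference=2.120)
print(f"t = {t:.2f}, differs from the reference at 95% confidence: {significant}")
```

A paired or two-sample t-test would follow the same pattern with a different standard error and number of degrees of freedom.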
Week 10
11/10  "8:10-10:00 AM"
More about significance tests

 
Week 13
11/30  "3:30-5:20 PM"

Students present the data sets that they will present on 12/29. We want to make sure that suitable data are chosen for a meaningful oceanographic/geochemical discussion based on statistical results.

12. Method validation-intercalibration
What does inter-calibration mean?
Why is inter-calibration needed?
How do you perform an inter-calibration?

Assigned reading: Cutter (2013)  
Week 17
12/21  "3:30-05:20 pm"
13. Time-series analysis

https://www.space.ntu.edu.tw/navigate/s/DA809E0D32454EC9BAAEFA0F640E830FQQY

 
Week 17
12/22  "8:10-10:00 AM"
Computer Lab 212

Introduction to MATLAB
Wavelet application 
Week 18
12/28  "3:30-05:20 pm"
Computer Lab 206

13. SPSS
- Principal component analysis
- Factor analysis 
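
The class uses SPSS for these analyses; purely to illustrate the computation behind principal component analysis, here is a minimal Python sketch that finds the first principal component of a small data matrix by power iteration on the covariance matrix. The function names and the data values are hypothetical, and the sketch works on raw covariances; in practice, variables with different units are usually standardized first.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of a list of samples (rows) by variables."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_principal_component(rows, iterations=500):
    """Return (loadings, fraction of total variance explained) for PC1."""
    cov = covariance_matrix(rows)
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p          # starting direction
    for _ in range(iterations):           # power iteration
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("no variance along the starting direction")
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    return v, eigenvalue / total_variance


# Hypothetical intensities of three compounds measured in five samples.
samples = [[1.0, 2.1, 0.5], [2.0, 3.9, 0.4], [3.0, 6.2, 0.6],
           [4.0, 8.1, 0.5], [5.0, 9.8, 0.4]]
loadings, explained = first_principal_component(samples)
print("PC1 loadings:", [round(x, 3) for x in loadings])
print(f"fraction of variance explained by PC1: {explained:.3f}")
```

Power iteration only recovers the leading component; a full PCA or factor analysis, as run in SPSS, also returns the remaining components and their loadings.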
Week 20
1/11  "3:30-5:20 PM"
Student presentation:
- Which statistical tools do you plan to use, if different from those reported?

If a published work is used, students must reproduce the figures and plots presented in the paper.
Students must derive the “oceanographic meaning” of the data set using statistical results.

https://automeris.io/WebPlotDigitizer/ 
Week 20
1/12  8:30-10:00 Review and wrap-up. We will finish the course by discussing real-world marine chemistry data. (1) Correlation (2) ANOVA https://youtu.be/0Vj2V2qRU10
 
Week 1-2
9/15  "8:10-10:00 AM"
1. Review: topics in marine chemistry
- Current research at IO-NTU
- Published research by professors from the chemistry division at IO-NTU
- Current published research in journals such as “Marine Chemistry” and “Limnology and Oceanography.”
- Look for the statistical methods used in these publications.
2. Review: chemical analytical instruments
- Spectrophotometry
- Chromatography
- Mass spectrometry
- Chemiluminescence
- Fluorimetry
- Optical sensors
- pH sensors
[Q] Where can you find these instruments at IO-NTU or on the NTU campus?

3. Students’ past experience in analytical chemistry. Students share their data with the class.
Week 2-1
9/21  "3:30-5:20pm"
4. Population versus samples in statistics
[Q] How representative are your “samples” of the population?

5. Numbers of replicates
- Analytical replicates
- Sampling replicates
- Subsampling replicates
- Overall methodological replicates
- Reproducibility
[Q] What is an acceptable number of replicates? Two? Three? Seven? One hundred? One thousand?
[Q] How reproducible are our sampling methods? Data from a hydrocast on the spring 2017 student educational cruise will be used.
Week 2-2
9/22  "8:10-10:00 AM"
6. Limit of detection and sensitivity
Estimation of the limit of detection of analytical methods such as the following (the list is not exhaustive; we will choose the methods that students are most interested in learning)
- Spectrophotometry for nutrient analysis
- Chromatography-HPLC for amino acid analysis
- Mass spectrometry
AMS
Isotope Ratio MS
Microgeobiology


Laboratory data gathering
Students design experiments to measure the errors of the method they choose. Students will be asked to analyze the data and to report the detection limit of their chosen method. For example, a student can choose to obtain the detection limit of a total organic carbon analyzer. The student will be asked to evaluate whether the detection limit changes with injection volume. As another example, a student can choose to obtain the detection limit of phosphate analysis using a spectrophotometer. The student will be asked to evaluate whether the detection limit changes with the optical path length of the flow cell (1 cm versus 5 cm).

[Q] How do you determine the best method for an analyte of interest? Which method will you use for measuring inorganic nutrients in the ocean? Trace elements? Elemental ratios? Organics? Dissolved inorganic carbon? pH?

[Q] What is the concentration range of an analyte of interest? (Go through the periodic table.) How does the detection limit of each method compare with the concentration range of the analyte of interest?
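
As a hedged illustration of one common way to estimate a detection limit, the sketch below fits a straight calibration line to standards by least squares and then reports three times the standard deviation of replicate blanks divided by the calibration slope. The 3-sigma criterion is one convention among several, and all names and numbers (`fit_line`, `detection_limit`, the absorbances, the concentrations) are hypothetical placeholders rather than course data.

```python
import statistics


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def to_concentration(signal, slope, intercept):
    """Convert an instrument signal to concentration with the calibration."""
    return (signal - intercept) / slope


def detection_limit(blank_signals, slope, k=3.0):
    """k times the SD of replicate blanks, expressed in concentration units."""
    return k * statistics.stdev(blank_signals) / slope


# Hypothetical calibration standards (µM) and their absorbances.
standards = [0.0, 0.5, 1.0, 2.0]
absorbance = [0.002, 0.012, 0.022, 0.042]
blanks = [0.001, 0.002, 0.003]            # hypothetical replicate blanks

slope, intercept = fit_line(standards, absorbance)
print(f"slope = {slope:.4f} per µM, intercept = {intercept:.4f}")
print(f"sample at A = 0.030 -> {to_concentration(0.030, slope, intercept):.2f} µM")
print(f"detection limit = {detection_limit(blanks, slope):.2f} µM")
```

Because the detection limit is inversely proportional to the slope, anything that raises the sensitivity, such as a longer optical path, lowers the detection limit as long as the blank noise does not grow in proportion.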
 
Week 3-1
9/28  "3:30-5:20pm"
7. Accuracy
- How do we evaluate the accuracy of a measurement?
- What kinds of standards or reference materials are available for oceanographic study?
- Are these standards prepared in seawater? Can we use standards prepared in fresh water?
[Q] How can we design an experiment to check the accuracy of our measurements? If the same sample is measured with two different methods and different concentrations are reported, which one is correct? For example, we can measure dissolved inorganic carbon concentrations with a headspace method using gas chromatography (NTU Geosciences Li-Hung Lin’s lab) and also with a purging method using a TOC analyzer (IO-NTU 302). What might have happened?

8. Error analysis
- What is an error?
- How does an error occur?
- Estimation of an error
- Error propagation
[Q] How does an analytical error compare with a limit of detection? Does your analytical error decrease with an increased sample size?
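
To make error propagation concrete, the sketch below applies the usual rules for independent random errors: absolute uncertainties add in quadrature for sums and differences, and relative uncertainties add in quadrature for products and quotients. The function names and all numbers are hypothetical; the example propagates the uncertainty of a blank-corrected signal through division by a calibration slope.

```python
import math


def add_sub_error(*abs_errors):
    """Uncertainty of a sum or difference of independent quantities."""
    return math.sqrt(sum(e * e for e in abs_errors))


def mul_div_error(result, *value_error_pairs):
    """Uncertainty of a product or quotient of independent quantities."""
    relative = math.sqrt(sum((err / val) ** 2 for val, err in value_error_pairs))
    return abs(result) * relative


# Hypothetical signals and calibration slope, each with its uncertainty.
sample, sample_err = 0.250, 0.003
blank, blank_err = 0.010, 0.004
slope, slope_err = 0.0200, 0.0004         # signal per µM

corrected = sample - blank
corrected_err = add_sub_error(sample_err, blank_err)

concentration = corrected / slope
concentration_err = mul_div_error(
    concentration, (corrected, corrected_err), (slope, slope_err))

print(f"corrected signal = {corrected:.3f} ± {corrected_err:.3f}")
print(f"concentration = {concentration:.1f} ± {concentration_err:.1f} µM")
```

These formulas assume the errors are uncorrelated; if two inputs share a common source of error, the quadrature sum can understate or overstate the combined uncertainty.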

9. Significant figures
[Q] Is it meaningful to report a phosphate concentration of 0.32536 ± 0.0223 µM? Why or why not?
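
One common convention, recommended for example in Taylor's text listed in the references, is to round the uncertainty to one significant figure and then round the value to the same decimal place. The sketch below applies that convention; the function name `format_measurement` is hypothetical, and other conventions (such as keeping two significant figures in the uncertainty) are equally defensible.

```python
import math


def format_measurement(value, uncertainty, sig_figs=1):
    """Round `uncertainty` to `sig_figs` figures and `value` to match."""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    exponent = math.floor(math.log10(uncertainty))   # position of first digit
    decimals = sig_figs - 1 - exponent                # decimal places to keep
    shown = max(decimals, 0)                          # no negative precision
    value_r = round(value, decimals)
    uncertainty_r = round(uncertainty, decimals)
    return f"{value_r:.{shown}f} ± {uncertainty_r:.{shown}f}"


print(format_measurement(0.32536, 0.0223), "µM")
print(format_measurement(0.32536, 0.0223, sig_figs=2), "µM")
```

Digits of the value that lie beyond the first uncertain decimal place carry no information, which is why they are dropped.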
Week 3-2
9/29  "8:10-10:00 AM"
10. Outliers (many methods)
- Dixon’s Q-test
- Grubbs’ test (https://graphpad.com/quickcalcs/grubbs2/)
- Generalized ESD Test for Outliers

[Q] The Total Organic Carbon (TOC) analyzer can be programmed to remove extreme values automatically. For example, it can be programmed to keep injecting a sample until the coefficient of variation (CV), also known as the relative standard deviation (RSD), is smaller than a set value of 1.5%. However, the instrument excludes the extreme values (highest or lowest) when reporting the mean and deviation. We set the instrument to use 4 values out of a maximum of 5 injections, and the intensities are 4.78, 4.83, 4.61, 4.42*, 4.67. The instrument excludes 4.42. Do you agree that this value is an outlier? Why or why not?
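
As one way to approach this question, the sketch below applies Dixon's Q-test to the five intensities quoted above. It computes Q as the gap between the suspect value and its nearest neighbor divided by the full range, then compares it with the tabulated two-sided critical value at 95% confidence. The function name `dixon_q` is hypothetical, and the choice of the 95% level is an assumption; Grubbs' test or the generalized ESD test may lead to a different discussion.

```python
# Two-sided critical Q values at 95% confidence, keyed by sample size n.
Q_CRIT_95 = {3: 0.970, 4: 0.829, 5: 0.710, 6: 0.625,
             7: 0.568, 8: 0.526, 9: 0.493, 10: 0.466}


def dixon_q(values):
    """Return (suspect value, Q, is_outlier) for the most extreme point."""
    data = sorted(values)
    n = len(data)
    if n not in Q_CRIT_95:
        raise ValueError("this sketch only tabulates Q for 3 to 10 values")
    spread = data[-1] - data[0]
    if spread == 0:
        raise ValueError("all values are identical")
    q_low = (data[1] - data[0]) / spread      # lowest value as suspect
    q_high = (data[-1] - data[-2]) / spread   # highest value as suspect
    if q_low >= q_high:
        suspect, q = data[0], q_low
    else:
        suspect, q = data[-1], q_high
    return suspect, q, q > Q_CRIT_95[n]


intensities = [4.78, 4.83, 4.61, 4.42, 4.67]  # values from the question
suspect, q, is_outlier = dixon_q(intensities)
print(f"suspect = {suspect}, Q = {q:.3f}, "
      f"critical Q = {Q_CRIT_95[len(intensities)]}, outlier: {is_outlier}")
```

The statistical verdict is only part of the answer; whether a value should be excluded also depends on what is known about the injection itself.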